Abstract - This work presents a stochastic dynamic programming (SDP) algorithm that minimizes an economic criterion based on the total energy consumption of a range extender electric vehicle (REEV). The algorithm integrates information from the REEV's navigation system to obtain information about the expected future vehicle speed. The model of the vehicle's energetic system, which consists of a high-voltage (HV) battery (the main energy source) and an internal combustion engine (ICE) working as an auxiliary energy source, is written as a hybrid dynamical system, and the associated optimization problem is stated in the hybrid optimal control framework. The hybrid optimal control problem includes two important physical constraints on the ICE, namely an activation delay and a decision lag. Three methods for including such physical constraints are studied. After introducing the SDP algorithm formulation, we comment on numerical results of the stochastic algorithm and its deterministic counterpart.

I. Introduction 

Electrified automotive powertrain technology is being developed rapidly as an increasing number of carmakers adopt powertrain electrification as a viable solution for reducing greenhouse gas emissions worldwide, in order to meet stricter regulations and consumer demand. Major manufacturers are making a strong effort to deploy fully electric vehicles (EVs) as early as possible in some car market segments. This work focuses on a specific class of EVs, namely range extender electric vehicles (REEVs). This study aims at synthesizing an optimal supervisory control strategy for the range extender (RE) using information from the vehicle navigation system (NAV) as well as a statistical analysis of previously stored driving data. One important feature of this study is the incorporation of two physical constraints of the RE - the execution delay and the decision lag - in the optimal control problem formulation. From the mathematical point of view, the system is modeled as a hybrid dynamical system in which discrete decisions may suffer from an execution delay and a decision lag. The next step is the formulation of a stochastic optimal control problem in this hybrid framework, alongside a stochastic dynamic programming principle which is then used to obtain an optimal controller.

In the literature, several works deal with the synthesis of optimal - or sub-optimal - supervisory control strategies for a vehicle with a certain degree of hybridization. These works include heuristics such as model-based [1] or fuzzy logic algorithms [2]. Attention is also given to adaptive control techniques [3]. Works [4], [5] use stochastic dynamic programming algorithms to synthesize control strategies that are not cycle-specific but do not include any information from the navigation system.

This study seeks to synthesize an optimal control sequence of the ICE in order to minimize an economic cost criterion related to overall energy consumption. In this case, a rather realistic scenario is that there is no precise information about how the vehicle will behave - i.e., what its power demand will be - on a particular trip, even if there is some information available from a navigation system.

Many modern control systems involve some high-level logical decision making process coupled with underlying low-level continuous processes [6], [7], [8], [9]. Examples include flight control systems, production systems, chemical processes and traffic management systems. The term hybrid stems from the different nature of the systems' evolution, continuous and discrete. Hybrid systems have some supervision logic that intervenes at isolated instants between two or more continuous functions. An important share of this research seeks to extend the well-known theory of continuous systems - for instance, the maximum principle [10], [12] and the dynamic programming principle [13] - to hybrid systems. Several authors [14], [15], [16] propose different modeling frameworks for hybrid dynamical systems, varying in degree of generality. This study is based on the quite general framework of [16], of which many of the applications appearing in the literature can be viewed as particular cases. Works [14], [13], [16], [17] make considerable effort towards modeling and simulating hybrid dynamical systems in both deterministic and stochastic cases. Continuous hybrid systems including decision lags and execution delays are studied in [18], [19], where a suitable treatment is given to describe these systems.

The contribution of this article is the use of a stochastic 
speed model in an optimal hybrid control framework, incor- 
porating NAV information, that integrates the RE's activation 
delay and decision lag constraints. 

II. Power Management Strategy for Range 
Extender Electric Vehicles 

A. Range Extender Electric Vehicles 

This section discusses the model of the power management system of a REEV that is to be optimized and introduces the notation used throughout the document. A REEV is a vehicle that combines a primary power source - an HV battery - and a small (power-wise) secondary power source - in our case an ICE. The traction (or propulsion) of the vehicle is performed by an electric motor connected to the vehicle's wheels. Both power sources may supply the energy demanded by the driver. However, since the model considers a range extender electric vehicle, it cannot rely solely on the RE's power to drive the vehicle. The architecture is that of a series hybrid electric vehicle, which means that the ICE is not mechanically connected to the transmission. Instead, a generator transforms the mechanical energy produced by the ICE into electric current that can be directed towards the electric motor or used to charge the battery. The powertrain components are represented by quasi-static models, detailed in [10], thus neglecting any transient response. The available controls are the switching on and off of the ICE and the power it produces. Due to the discrete nature of the on/off switch control and the continuous nature of the power control, the control variable is seen as a hybrid control. The controlled variables are the state of charge (SOC) of the battery and the ICE state, on or off. A power management strategy (PMS) for a REEV is a control sequence that dictates the state of the ICE - on or off - and, if it is on, how much power it will supply to the electric motor.

B. Stochastic Model 

The instantaneous SOC evolution depends on the vehicle's instantaneous power demand. To devise efficient PMSs, the control synthesizer needs information about the future power demands of the system that is as close as possible to the power demand actually requested. Hence, given a fixed route, since the control synthesizer cannot rely on exact knowledge of the speed before the drive, the PMS must be robust enough to "absorb" most of the situations occurring in a real driving cycle. In the case presented here, it contains information about possible deviations from the speed suggested by the NAV. The NAV outputs useful information in the form of a geographic route. After the driver has entered a geographic location as a destination point in the NAV, it suggests a route, calculated using some optimization algorithm. A working hypothesis is that the vehicle will follow the route suggested by the NAV. The route is an assemblage of links, each of which has an associated constant recommended cruise speed. The route segmentation is used as the discretization of the optimal control problem. The index $k = 1, \dots, K$ refers to the route links. Each link has an entry point and an exit point; thus, for a route with $K$ links we have $K + 1$ nodes. Index $k$ indexes a variable at a link's entry node. Then, at the $k$-th link's entry point, one has the battery SOC $x_k$, the vehicle instantaneous speed $y_k$ and the ICE state $q_k$. In the same manner, the ICE controls are applied at each link entrance and are denoted by $u_k$, the ICE power, and $w_k$, the ICE on/off switch decision.

Each link's speed depends on particular characteristics of the route segment considered, such as type, number of lanes and location, and may be dynamic, depending on weather conditions or the hour of the day. The speed profile formed with the speed suggested for each route link is the NAV speed profile. As a result of the disparity between the real driving speed - not known a priori - and the NAV speed profile, the power that will be instantaneously required by the vehicle during a trip is not known. Let $\Omega^{NAV}$ be the - finite and discrete - set of possible NAV property values, consisting for instance of number of lanes, location and hour of the day. Then, for a property set $\eta \in \Omega^{NAV}$, let $\Omega(\eta)$ be the space of possible vehicle speeds. For example, if $\eta$ corresponds to a 3-lane link located on a freeway at 2 p.m., one possible speed set may be $\Omega(\eta) = \{90, 110, 130, 150\}$. For a fixed $\eta$, let $\mathcal{F}$ be a $\sigma$-algebra over $\Omega(\eta)$ and $\mathcal{P}$ a probability. Then, we define $\xi : \Omega(\eta) \to \Omega(\eta)$ to be a random variable over the probability space $(\Omega, \mathcal{F}, \mathcal{P})(\eta)$ - we denote by $\xi$ both the random variable and its particular realization; the usage will be clear from the context. Given two sets of properties $\eta, \eta' \in \Omega^{NAV}$ (representing two neighboring links) and two speed values $y \in \Omega(\eta)$, $y' \in \Omega(\eta')$, we wish to define the probability of changing to speed $y'$ from speed $y$. It is natural to consider that the transition probability from $y$ to $y'$ depends on $\eta$ and $\eta'$. For all $\eta, \eta' \in \Omega^{NAV}$, we define a transition probability $P$ between $y$ and $y'$ to be $P(y, y') = \mathcal{P}(\xi' = y' \mid \xi = y, \eta, \eta')$. Such properties should be tailored to contain relevant information. For instance, one may not need to distinguish time frames 15 minutes apart, whereas a distinction between rush hours and lower traffic periods might be a more interesting choice.
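
The text does not state how the transition probability $P$ is estimated from the stored driving data. A minimal sketch, assuming simple relative-frequency counting over recorded link-to-link speed pairs, could look as follows. The function name, the property labels and the sample records are hypothetical and serve only as illustration.

```python
from collections import Counter, defaultdict


def estimate_transition_probabilities(observations):
    """Estimate P(y' | y, eta, eta') by relative frequencies.

    observations: iterable of (eta, eta_next, y, y_next) tuples, where eta and
    eta_next are hashable NAV property labels and y, y_next are speed classes.
    Returns {(eta, eta_next, y): {y_next: probability}}.
    """
    counts = defaultdict(Counter)
    for eta, eta_next, y, y_next in observations:
        counts[(eta, eta_next, y)][y_next] += 1
    table = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        table[key] = {y_next: n / total for y_next, n in counter.items()}
    return table


# Hypothetical records (eta, eta', y, y'); the labels and speeds are made up.
FREEWAY = "freeway_3_lanes_off_peak"
records = [
    (FREEWAY, FREEWAY, 110, 130),
    (FREEWAY, FREEWAY, 110, 110),
    (FREEWAY, FREEWAY, 110, 130),
    (FREEWAY, FREEWAY, 110, 90),
]
P = estimate_transition_probabilities(records)
row = P[(FREEWAY, FREEWAY, 110)]  # distribution of the next speed given y = 110
```

Each row of the resulting table sums to one by construction. Coarser property labels (for instance rush hour versus off-peak rather than 15-minute slots) give more samples per row, in line with the remark above.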

C. Execution Delay and Decision Lag 

This section explains the motivation for including the execution delay and decision lag constraints in the model, as well as the method used to incorporate these constraints in the optimal hybrid control problem. Frequent switching of the RE is undesirable because it causes mechanical wear of the RE and acoustic nuisance for the driver. Decision lags impose a minimum time to be respected before turning the RE on and thus avoid these issues. Also, the RE's ability to deliver power to the vehicle's HV network is constrained by the catalytic converter's temperature. Indeed, the catalytic converter present in the ICE's exhaust pipe must undergo a warm-up period in order to reach a satisfactory working point. Until then, the RE must avoid operating at any working point other than the minimal one - required for not stalling - and thus cannot provide any auxiliary power to the system. The activation delay models that behavior. In this application, although distinct in nature, the decision lag and the execution delay have the same order of magnitude, about 120 s. Thus, they are conveniently grouped into a single constant delay/lag variable $\delta$, expressed in time units.

These constraints are introduced through a state variable at each link entry point, $t_k \in [0, \delta]$, representing the time elapsed since the last turn-RE-on decision. In a nutshell, if $t_k < \delta$, no switches are available to the controller and no power can be issued from the RE. Whenever $t_k \ge \delta$, switches are available and the RE can provide auxiliary power to the vehicle.



D. Hybrid State and Controls 

A hybrid state is a state vector consisting of continuous as well as discrete valued variables. The continuous variables are the battery SOC $x_k \in [0, 1]$, the vehicle instantaneous speed $y_k \in [0, y_{\max}]$ and the counting time $t_k \in [0, \delta]$, and the discrete variable is the ICE state $q_k \in \{0, 1\}$. Let $Z = \mathcal{X} \times \mathcal{Y} \times \mathcal{T} \times \mathcal{Q}$ be the hybrid state space and $z_k = (x_k, y_k, t_k, q_k) \in Z$ be the hybrid state at each link's entry point. Let $w_k$ be the discrete control applied to the system at link $k$. The discrete controls are used to switch between continuous modes of the hybrid system. They are valued in a discrete set $W(z_k) \subseteq \{0, 1\}$ in the case of a two-mode system. The control $w_k = 0$ means no particular order, simply leaving whatever mode is active, while $w_k = 1$ represents a mode switch. We recall that only off-to-on transitions are delayed in time. Mode transitions are decided at link $k$ but are executed afterwards, and between the decision and execution instants no other switch order can be decided. Continuous controls $u_k$ are valued in a continuous set $U(z_k)$. Let $a_k = (u_k, w_k)$ be the hybrid control valued in $A(z_k) = U(z_k) \times W(z_k)$, the hybrid control space.

Definition 2.1: The control sequence $(a_1, \dots, a_K)$ is an admissible hybrid control sequence if it satisfies the following relations. (1) For $(z_1, \dots, z_K) \in Z^K$, we have $(a_1, \dots, a_K) \in A(z_1) \times \cdots \times A(z_K)$. (2) Let $\Gamma = \{k \mid w_k = 1\}$ be the set of links in which a switch decision is made. Then, $\forall i \in \Gamma$, $t_i \ge \delta$. (3) $t_{K+1} \ge \delta$.

The first condition states that every hybrid control should be in the hybrid control domain. The second condition is the decision lag, whereas the third one implies that no order that suffers from an execution delay can be decided and not executed. For a hybrid state $z = (x, y, t, q) \in Z$, we can write the control space as follows:

$$U(z) = \begin{cases} \{0\} & \text{if } t < \delta \text{ or } q = 0, \\ U(y) & \text{if } t \ge \delta \text{ and } q = 1, \end{cases} \tag{1}$$

$$W(z) = \begin{cases} \{0\} & \text{if } t < \delta, \\ \{0, 1\} & \text{if } t \ge \delta. \end{cases} \tag{2}$$
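
As an illustration of the control sets (1)-(2), the following minimal sketch enumerates the admissible hybrid controls for a given hybrid state. The power levels returned by `power_set` (in kW), its parameters and the default `delta` are assumptions made for the example; the text does not specify how $U(y)$ is discretized or how it depends on the speed.

```python
def power_set(y, u_max=30.0, n_levels=4):
    """Hypothetical discretization of U(y): evenly spaced ICE power levels.

    The dependence on the speed y is not specified in the text, so y is
    accepted but ignored in this sketch.
    """
    step = u_max / (n_levels - 1)
    return [i * step for i in range(n_levels)]


def admissible_controls(z, delta=120.0):
    """Return the list of hybrid controls (u, w) allowed in state z = (x, y, t, q)."""
    x, y, t, q = z
    # Eq. (1): power is available only if the timer has elapsed and the ICE is on.
    u_values = power_set(y) if (t >= delta and q == 1) else [0.0]
    # Eq. (2): a switch order is available only if the timer has elapsed.
    w_values = [0, 1] if t >= delta else [0]
    return [(u, w) for u in u_values for w in w_values]


locked = admissible_controls((0.5, 90.0, 30.0, 1))    # timer still running
free_on = admissible_controls((0.5, 90.0, 120.0, 1))  # ICE on, timer elapsed
free_off = admissible_controls((0.5, 90.0, 120.0, 0)) # ICE off, timer elapsed
```

While the timer is running, the only admissible control is "no power, no switch", which is how the lag and the delay are enforced without any penalty term.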



E. Hybrid State Evolution and Control Policies 

This section describes the evolution equations of the state vector components from a known value $z_k$ at link $k$. The speed on the next link is a random variable $\xi_k > 0$. The SOC evolution depends on the power produced by the ICE, $u_k$, and on $\xi_k$, as well as on $x_k$ and $q_k$. The time since the last activation on the next link depends on $w_k$, $t_k$, $\xi_k$ and on the link length $d_k$, given by the NAV. The evolution of the discrete variable $q_k$ depends on decisions made in the last link. This is summarized as follows:

$$X_{k+1} = f(x_k, u_k, \xi_k, q_k), \tag{3}$$

$$Y_{k+1} = \xi_k, \tag{4}$$

$$T_{k+1} = \begin{cases} t_k + d_k / \xi_k & \text{if } w_k = 0, \\ d_k / \xi_k & \text{if } w_k = 1, \end{cases} \tag{5}$$

$$Q_{k+1} = g(q_k, w_k). \tag{6}$$
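
The transition (3)-(6) can be sketched as a single function. This is only an illustration under stated assumptions: the battery model used for $f$ (a traction power proportional to speed and a fixed battery capacity), the toggle rule used for $g$, and the saturation of the timer at $\delta$ are all placeholders chosen for the example, not the quasi-static models of the paper.

```python
def step(z, a, xi, d, e_batt_kwh=20.0, delta=120.0):
    """Apply one transition of the hybrid state.

    z: (x, y, t, q) hybrid state; a: (u, w) with u in kW;
    xi: realized speed on the link in m/s; d: link length in m.
    """
    x, y, t, q = z
    u, w = a
    dt = d / xi                       # travel time on the link, in seconds
    p_demand = 0.5 * xi               # hypothetical traction power, in kW
    # Eq. (3) with a placeholder f: net battery energy drawn over the link.
    x_next = x - (p_demand - q * u) * dt / 3600.0 / e_batt_kwh
    x_next = min(1.0, max(0.0, x_next))
    y_next = xi                       # Eq. (4)
    # Eq. (5): the timer restarts on a switch order, otherwise it accumulates.
    t_next = dt if w == 1 else t + dt
    t_next = min(t_next, delta)       # assumed saturation so that t stays in [0, delta]
    # Eq. (6) with an assumed g: a switch order toggles the ICE state.
    q_next = 1 - q if w == 1 else q
    return (x_next, y_next, t_next, q_next)


after_switch = step((0.5, 20.0, 120.0, 0), (0.0, 1), 20.0, 1000.0)
after_hold = step((0.5, 20.0, 100.0, 1), (5.0, 0), 20.0, 1000.0)
```

In the first call a switch order restarts the timer at the link travel time (50 s here), so the controls stay locked on the following links until the timer reaches `delta` again.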



Given a known state vector $z_k$, the next speed value is not exactly known, but it depends on the next link's set of properties $\eta_{k+1}$ given by the NAV. The transition probability $P$ dictates the random evolution of the system state vector, i.e., the random process $(Z_k)$, where the $Z_k$ are $\mathcal{F}$-measurable for all $k \ge 1$. An admissible control sequence depends on the state values at future times, which are unpredictable, and so are the future control domains. Nonetheless, one can define a sequence of hybrid control laws, or a hybrid policy, $(\alpha_1, \dots, \alpha_K)$, where each $\alpha_k$ is a function mapping the hybrid state to a hybrid control $a_k$:

Definition 2.2: A hybrid policy $\pi = (\alpha_1, \dots, \alpha_K)$ is a sequence of functions $\alpha$ where

$$\alpha : Z \to A, \qquad z \mapsto \alpha(z) = a. \tag{7}$$

A policy $\pi$ is said to be an admissible hybrid policy with respect to $z$ if, given an initial state $z$, the sequence $(\alpha_1(z), \alpha_2(z_2), \dots, \alpha_K(z_K))$ is an admissible hybrid control sequence for all $z_2, \dots, z_K \in Z$. Let $\Pi(z)$ be the set of all admissible hybrid policies with respect to the initial state $z$.

As a consequence, if a policy $\pi \in \Pi(z)$ is admissible, the controls produced by it form an admissible hybrid control sequence.

F. Cost Functions 

Given a hybrid state $z$ and a hybrid control $a$, let $l(z, a) \in \mathbb{R}$ be an instantaneous cost function. It is useful to make explicit the separation of the instantaneous cost function into the components associated with the continuous and the discrete controls. Given a hybrid control $a = (u, w)$, denote $l(z, a) = \sigma(u) + \rho(w)$, where $\sigma$ is the continuous control cost component and $\rho$ is the cost due to mode switching. For $\beta > 0$, define a final cost function, evaluated at the final state $z = (x, y, t, q)$, as $\phi(z) = -\beta x$. The final cost function $\phi(\cdot)$ is assumed to have a finite expected value. The parameter $\beta$ works as a scaling factor adjusting the relative value of the electricity and fuel consumption and can be seen as reflecting the economic price of 1% of SOC relative to 1 l of fuel.

Definition 2.3: Given an initial state $z$ and an admissible policy $\pi \in \Pi(z)$, the states $Z_1, \dots, Z_{K+1}$ are random variables given by

$$Z_{k+1} = h(Z_k, a_k, \xi_k), \quad k = 1, \dots, K, \qquad Z_1 = z, \tag{8}$$

where $h$ abbreviates (3)-(6). For $k = 1, \dots, K$, the expected total cost of the process (8) with policy $\pi$ starting at $z$ is

$$J(z, \pi) = \mathbb{E}\left[ \sum_{k=1}^{K} \big( \sigma(u_k) + \rho(w_k) \big) - \beta X_{K+1} \right], \tag{9}$$

where the expectation is taken over the $\xi_k$ and the $Z_k$. Observe that even if some discrete orders need to wait some time before being executed, their cost is incurred at the decision stage. A policy $\pi^*$ is said to be optimal if

$$J(z, \pi^*) = \min_{\pi \in \Pi(z)} J(z, \pi), \tag{10}$$

i.e., it minimizes the expected cost (9).



G. Value Functions and Stochastic Dynamic Programming 
Algorithm 

To establish a dynamic programming principle, it is necessary to extend the minimization problem (10) to general initial conditions.

The discrete random process departing from link $k_0$, given an initial condition $z \in Z$ and an admissible hybrid policy $\pi \in \Pi(z)$, is the solution of

$$Z_{k+1} = h(Z_k, a_k, \xi_k), \quad k = k_0, \dots, K, \qquad Z_{k_0} = z. \tag{11}$$

Given instantaneous cost functions $l_k$ for $k = k_0, \dots, K$, the expected cost-to-go function of the process (11) is

$$C(z, k_0, \pi) = \mathbb{E}\left[ \sum_{k=k_0}^{K} \big( \sigma(u_k) + \rho(w_k) \big) - \beta X_{K+1} \right]. \tag{12}$$

Define the value function of the optimal control problem of finding a policy that minimizes the expected cost (12) to be

$$v(z, k_0) = \min_{\pi \in \Pi(z)} C(z, k_0, \pi). \tag{13}$$

A stochastic dynamic programming principle can be obtained from (13). The next proposition allows the evaluation of the value function in a classical backward fashion.

Proposition 2.1: The value function (13) is calculated by the backward procedure:

* $v(z_{K+1}, K + 1) = -\beta x_{K+1}$;
* For $k = K, \dots, 1$,

$$v(z_k, k) = \min_{a_k \in A(z_k)} \mathbb{E}\big[ l(z_k, a_k, \xi_k) + v(Z_{k+1}, k + 1) \big]. \tag{14}$$

Proof: The proof is derived from classical arguments. See for instance [21]. ■



H. Optimal Trajectory Reconstruction 

This section details the procedure for synthesizing an optimal policy. Evaluating the value function for all values of a discretized state space and all links is the first step towards finding a sequence of control decisions that minimizes (9). Then, given particular initial conditions, one can synthesize a control sequence using an optimal trajectory reconstruction algorithm.

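Algorithm 2.1:

1) Step 1: Value function initialization.

a) Set $k = K + 1$, set $\beta > 0$;

b) For all $z = (x, y, t, q) \in Z^{\#}$, set

$$v(z, K + 1) = \begin{cases} -\beta x & \text{if } t = \delta, \\ +\infty & \text{if } t < \delta. \end{cases}$$

2) Step 2: Backward evaluation.

a) For $k = K, \dots, 1$, for all $z_k \in Z^{\#}$, evaluate:

$$v(z_k, k) = \min_{(u, w) \in A(z_k)} \Big[ \sigma(u) + \rho(w) + \sum_{\xi \in \Omega(\eta_{k+1})} v(z_{k+1}, k + 1) \, P(y_k, \xi) \Big].$$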
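
The backward evaluation of Algorithm 2.1 can be sketched on a toy problem. Everything numeric below is an assumption made for the example: the SOC grid in tenths, two speed classes, a timer counted in integer ticks, and the cost coefficients `SIGMA` and `RHO`. The toy dynamics are chosen so that successor states fall exactly on the grid, which avoids the interpolation needed in a realistic implementation.

```python
import itertools
import math

BETA, SIGMA, RHO = 2.0, 0.15, 0.05   # hypothetical cost coefficients
DELTA = 2                            # lag/delay in timer ticks (hypothetical)
X_GRID = range(11)                   # SOC grid in tenths: 0.0, 0.1, ..., 1.0
TICKS = {1: 2, 2: 1}                 # link travel time for each speed class
DRAIN = {1: 1, 2: 2}                 # SOC grid steps consumed on a link
P = {1: {1: 0.7, 2: 0.3},            # P[y][xi]: speed transition probabilities
     2: {1: 0.4, 2: 0.6}}


def controls(t, q):
    """Admissible (u, w) pairs, following the control sets (1)-(2)."""
    us = (0, 1, 2) if (t >= DELTA and q == 1) else (0,)
    ws = (0, 1) if t >= DELTA else (0,)
    return itertools.product(us, ws)


def step(x, t, q, u, w, xi):
    """Toy transition: returns the next (x, t, q); the next speed is xi."""
    x_next = min(10, max(0, x - DRAIN[xi] + u))
    dt = TICKS[xi]
    t_next = min(DELTA, dt if w else t + dt)
    q_next = 1 - q if w else q
    return x_next, t_next, q_next


def backward_sdp(K):
    """Return v[k][(x, y, t, q)] for k = 1, ..., K + 1."""
    states = list(itertools.product(X_GRID, P, range(DELTA + 1), (0, 1)))
    # Step 1: final cost, infinite if a delayed order has not been executed.
    v = {K + 1: {(x, y, t, q): (-BETA * x / 10 if t == DELTA else math.inf)
                 for x, y, t, q in states}}
    # Step 2: backward evaluation.
    for k in range(K, 0, -1):
        v[k] = {}
        for x, y, t, q in states:
            best = math.inf
            for u, w in controls(t, q):
                expected = 0.0
                for xi, p in P[y].items():
                    if p > 0:
                        xn, tn, qn = step(x, t, q, u, w, xi)
                        expected += p * v[k + 1][(xn, xi, tn, qn)]
                best = min(best, SIGMA * u + RHO * w + expected)
            v[k][(x, y, t, q)] = best
    return v


v = backward_sdp(K=3)
```

States from which the final condition $t = \delta$ cannot be guaranteed keep an infinite value, so the minimization never selects a switch order that could remain unexecuted at the end of the route.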



Algorithm 2.1 is used for the evaluation of (13) over the whole state space and at all links, where $Z^{\#}$ denotes a grid obtained from a discretization of $Z$. Once (13) has been evaluated on a grid of the whole state space and at all links, given an initial condition $z_1 = z$, the optimal trajectory reconstruction is made as follows:

Algorithm 2.2:

1) Step 1: Initialization.

a) Set $k = 1$, $z_1 = z$;

2) Step 2: Control decision.

a) $a_k^* = \arg\min_{a \in A(z_k)} \mathbb{E}\big[ \sigma(u) + \rho(w) + v^{\#}(h(z_k, a, \xi), k + 1) \big]$;

3) Step 3: Random realization and state evolution.

a) Set $\xi$ to the realized speed value;

b) $z_{k+1} = h(z_k, a_k^*, \xi)$;

4) Step 4: Advance to next stage.

a) $k = k + 1$;

b) If $k = K + 1$, terminate. Else, go to Step 2.

Here, $v^{\#}$ is the interpolation of the value function on the grid $Z^{\#}$ at $z_{k+1}$.
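
The forward pass of Algorithm 2.2 can be sketched generically, with the model supplied as functions. The function `reconstruct` and all the toy stand-ins below (`controls`, `step`, `stage_cost`, `value`, `speed_dist`) are hypothetical; in particular `value` is a simple closed-form substitute for the interpolated value function, used only so that the sketch runs end to end.

```python
import random


def reconstruct(z1, K, controls, step, stage_cost, value, speed_dist, rng):
    """Roll the system forward, choosing at each link the control that
    minimizes stage cost plus expected value at the next link."""
    z, trajectory = z1, []
    for k in range(1, K + 1):
        dist = speed_dist(z, k)

        def expected_cost(a, z=z, k=k, dist=dist):
            return stage_cost(a) + sum(p * value(step(z, a, xi), k + 1)
                                       for xi, p in dist.items())

        a_star = min(controls(z), key=expected_cost)               # Step 2
        xi = rng.choices(list(dist), weights=list(dist.values()))[0]  # Step 3a
        z = step(z, a_star, xi)                                    # Step 3b
        trajectory.append((k, a_star, xi, z))                      # Step 4
    return trajectory


# Toy stand-ins (all hypothetical): state z = (x, q), control a = (u, w).
def controls(z):
    return [(u, 0) for u in ((0, 1) if z[1] else (0,))]


def step(z, a, xi):
    return (z[0] - xi + a[0], z[1])


def stage_cost(a):
    return 0.15 * a[0]


def value(z, k):
    return -0.2 * z[0]   # stand-in for the interpolated value function


def speed_dist(z, k):
    return {1: 0.7, 2: 0.3}


traj = reconstruct((5, 1), 3, controls, step, stage_cost, value,
                   speed_dist, random.Random(0))
```

In a simulation against recorded data, the random draw in Step 3a would be replaced by the measured speed on the link, while the minimization in Step 2 still uses the transition probabilities.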

III. Numerical Application 

This section discusses the results obtained with simulated PMSs using a suitable speed profile. Firstly, it states under what conditions the PMSs are synthesized and shows some of their main characteristics. Secondly, results on the performance of the calculated PMSs running on real speed cycles are shown. The performance of a policy $\pi$ is discussed in the light of the economic gain relative to a pure EV strategy, i.e., a strategy that does not use the RE, namely

$$J^* = \frac{-\beta x_{K+1} + \sum_{k=1}^{K} \big( \sigma(u_k) + \rho(w_k) \big)}{-\beta x^{EV}_{K+1}}, \tag{15}$$

where $x^{EV}_{K+1}$ is the final SOC when the RE is not activated. The CPU time needed for policy evaluation is also reported.
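
As a worked illustration of the relative criterion (15), the following sketch computes it from a simulated trajectory. The function name, its arguments and the numbers in the example call are hypothetical.

```python
def relative_criterion(x_final, x_final_ev, fuel_costs, switch_costs, beta=2.0):
    """Relative criterion of eq. (15).

    x_final: final SOC with the evaluated strategy;
    x_final_ev: final SOC when the RE is never activated;
    fuel_costs, switch_costs: per-link values of sigma(u_k) and rho(w_k).
    """
    total = -beta * x_final + sum(fuel_costs) + sum(switch_costs)
    return total / (-beta * x_final_ev)


# Hypothetical trip: the RE raises the final SOC from 0.30 to 0.40
# at a fuel cost of 0.10 and a switching cost of 0.05.
j_star = relative_criterion(0.40, 0.30, [0.10], [0.05])
```

Because both numerator and denominator are negative costs, a value above 1 means the evaluated strategy does better than the pure EV strategy, which is how the percentages reported in this section are read.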

In this section, three different ways of adding lag/delay information to the synthesis of optimal controllers are analyzed. The first one is a rather classical strategy of penalizing the switch cost $\rho$ by a factor $\lambda \ge 1$. The second technique exploits the available NAV data in order to implicitly provide the dynamic programming algorithm with time-related information - recall that the dynamic programming algorithm is based on a route segmentation and is thus space-related. Finally, the third method is the one given by Algorithms 2.1 and 2.2, using the state variable $t$ and not relying on any particular structure of the problem itself, thus having the advantage of being flexible and general.

A. Deterministic PMSs and Simulations 

In order to establish a benchmark, this section analyzes controllers synthesized using NAV data and simulated in the ideal case, i.e., a deterministic case where the vehicle speed along the route follows exactly the NAV suggested speed - and hence is perfectly known. This result will then be compared to the same controller simulated in real driving conditions, using data recorded from the vehicle, in order to assess the performance loss.



As pointed out, the deterministic approach consists in assuming that the driver will follow exactly the speed profile suggested by the NAV. Hence, the variable $y$ can be removed from the hybrid state and the problem takes a deterministic form. For a fixed route, and thus a fixed NAV suggested speed profile, the value function is evaluated using the backward dynamic programming Algorithm 2.1, where the expected value is dropped since only one possible realization of the driver speed is considered. Then, after setting initial conditions, the controller is synthesized using Algorithm 2.2. Throughout all simulations, $\beta = 2$.

1) Penalization factor $\lambda$: Because intermittent fast switching is undesirable, one possible approach to decreasing the number of switches of the RE is the switch cost penalization. This approach sets $\delta = 0$ and includes a multiplicative penalty $\lambda \ge 1$ at each switch of the RE, increasing the switch cost to $\lambda \rho(\cdot)$.

The CPU time is 26 seconds for all values of $\lambda$. For $\lambda = 1$ the system is in its nominal cost configuration without any lag/delay constraints. Indeed, the strategy has a better performance (15) than when the penalty is added (cf. Table I), as one should expect. For $\lambda = 2$, a compromise is achieved between the criterion obtained and the number of switches executed. The controller executes 8 switches but, by inspecting the trajectories, one can see that they are not sufficiently far apart to respect the decision lag constraint. For $\lambda > 2$ the strategy does not change. It becomes optimal to turn on the RE right away and turn it off only near the end of the route because of the higher cost of restarting it later. The penalization technique, in spite of effectively reducing the number of switches, does not respect the control constraints needed in real applications. Indeed, even if there are fewer switches, they are not necessarily spaced far enough apart in time to respect the decision lag. Additionally, in no case is the execution delay constraint respected. This is an expected behavior, as the penalization technique does not supply any direct information about any of the control constraints to the value function. Despite its advantages - simple numerical implementation and fast execution time - this approach does not seem fit for the desired application.

TABLE I

Performance criteria for penalized deterministic controllers with $\delta = 0$.

                   J*        switches
$\lambda = 1$      1.1432    22
$\lambda = 2$      1.0812    8
$\lambda = 20$     1.0534    2



2) Discretization grid adaptation: When dealing with a deterministic formulation, the vehicle speed $y$ can be removed from the state vector. In addition, as the vehicle speed is known beforehand, there is a unique relation between the position of any point along the route and the time at which it will be reached by the vehicle. In this approach, that information is used for the discretization of the dynamic programming algorithm. Conveniently, the discretization is made in time intervals $\Delta t$ that are sub-multiples of the delay and lag time $\delta$, satisfying the relation $\delta = m \Delta t$, with $m$ an integer. Indeed, proceeding as such, as one evaluates the value function at a node $k$, there is implicit information about the time since the vehicle departure, which is $k \Delta t$.
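
The space-time relation used by this approach can be illustrated with a short sketch that places grid nodes along the route at the positions reached every $\Delta t = \delta / m$ seconds, given the NAV speed on each link. The function name, its arguments and the example route are hypothetical.

```python
def time_grid_positions(link_lengths, link_speeds, delta=120.0, m=2):
    """Positions (in m) reached at times 0, dt, 2*dt, ... with dt = delta / m.

    link_lengths: length of each link in m;
    link_speeds: constant NAV speed on each link in m/s.
    """
    dt = delta / m
    positions, pos = [0.0], 0.0
    remaining = dt                    # time left before the next grid node
    for length, speed in zip(link_lengths, link_speeds):
        left = length                 # distance left on the current link
        while left > 1e-9:
            travel = speed * remaining
            if travel <= left:
                # The next node falls inside (or at the end of) this link.
                pos += travel
                left -= travel
                positions.append(pos)
                remaining = dt
            else:
                # The link ends first: carry the leftover time to the next link.
                pos += left
                remaining -= left / speed
                left = 0.0
    return positions


# Hypothetical route: 1200 m at 10 m/s followed by 1800 m at 30 m/s.
nodes = time_grid_positions([1200.0, 1800.0], [10.0, 30.0], delta=120.0, m=2)
```

Nodes are close together where the NAV speed is low and far apart where it is high, so node $k$ always corresponds to time $k \Delta t$. This construction relies on the speed being known in advance, which is why it does not carry over to the stochastic case.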

Observing the different performance levels for the values of $m$ considered, ranging from an improvement of 2.0% to 5.6%, one can infer that the route discretization plays an important role in the performance level of the PMS. The major drawback of this approach is that, in spite of the constraint satisfaction and the simplification obtained in Algorithms 2.1 and 2.2, the CPU time rapidly becomes rather large (cf. Table II). For that reason, such an approach is not judged fit for an application. Also, note that even if the grid could be discretized in a more sophisticated fashion - e.g., making $m$ vary to capture a more volatile speed in some sectors - such an approach would fail in the stochastic case, because the space-time transformation would not be available. Although this method cannot be applied in a stochastic scenario, it can still contribute very relevant information concerning the route discretization.

TABLE II

Performance criteria and CPU time for deterministic controllers synthesized using adapted grid.

            J*        CPU time (s)
$m = 1$     1.0205    5.28
$m = 2$     1.0563    135.11
$m = 3$     1.0345    402.72
$m = 4$     1.0515    789.70



3) General approach: This subsection presents the results obtained for a deterministic approach that incorporates the lag and delay constraints on the control by setting $\delta = 120$ s and taking into account the state vector variable $t$. The relative performance level achieved in this case is $J^* = 1.046$, which represents an improvement of 4.6% over a purely electric strategy. The CPU time for the controller synthesis is 204 seconds, which is of the same order as that of the time-wise grid discretization with $m = 2$. However, when $m = 2$, the performance level is 5.6% better than an all-EV policy and thus better than the general approach. Again, this remark suggests that the route discretization plays a major role in the policy performance, as one might expect.

B. Stochastic PMSs and Simulations 

The analysis of results of controllers synthesized for use with a real driving cycle is presented in this section. In this case, the vehicle's future speed is known only with a certain probability. The value function is evaluated using Algorithm 2.1 and a policy is synthesized using NAV speed data, both setting $\delta = 0$ and using a penalization factor $\lambda$, and setting $\delta = 120$ s and including the time variable in the state vector. Next, this policy is simulated using several recorded real speed profiles. Each simulation is associated with a relative performance criterion $J^*$. The mean performances of the synthesized policies are grouped in Table III. Figure 1 also includes the standard deviation of the relative performances. When including a penalization factor $\lambda$, less switching is observed and the ICE is on throughout most of the path. However, the activation and decision constraints are not properly taken care of by the controller, which is thus not suitable for any realistic application. On average, the formulation including $t$ as a state variable has a better relative performance level, achieving an improvement of 3.9% over a purely electric mode while respecting the control constraints.

[Figure 1 plot: mean relative criteria comparison of stochastic PMSs ($\beta = 2$).]

Fig. 1. Comparison between mean relative performance criteria for stochastic PMSs simulated using real data.

TABLE III

Performance criteria for penalized stochastic controllers with $\delta = 0$ and controller with $\delta = 120$ s.

                   J*        switches (mean)    CPU time (s)
$\lambda = 1$      1.0339    23.50
$\lambda = 2$      1.0284    16.50              2320
$\lambda = 20$     0.9926    10.06
$\delta = 120$     1.0386    19.25              4964



IV. Discussion 

This work presents a stochastic dynamic programming algorithm for synthesizing optimal power management strategies, suitable for range extender electric vehicles. The stochastic model considers that, along a geographic route, the vehicle's speed is given by a mean value plus a random disturbance. The hybrid dynamical system framework is used to state a hybrid optimal control problem. The model considered presents two important features, namely the use of information from the vehicle's navigation system and the inclusion of physical constraints on the range extender - the activation delay and the decision lag. Three methods for including the aforementioned control constraints are studied: penalization of the switching cost, discretization of the path in sub-multiples of the delay/lag time, and inclusion of a time state variable. In conclusion, the only method capable of taking into account the delay/lag constraints in a stochastic scenario is the general formulation, which includes a time state variable. Results using real recorded data show that the inclusion of the time state variable allows an average improvement of 3.86% in the performance level when compared to a purely electric strategy.